- 
            Free, publicly-accessible full text available June 1, 2026
- 
            This paper extends Naor’s analysis of the join-or-balk problem in observable M/M/1 queues. All other Markovian assumptions still hold, but we assume that the arrival rate is uncertain and study the problem in a distributionally robust setting. We first study the problem with the classical moment ambiguity set, where the support, mean, and mean-absolute deviation of the underlying distribution are known. Next, we extend the model to the data-driven setting, where decision makers only have access to a finite set of samples. We develop three optimal joining threshold strategies from the perspectives of an individual customer, a social optimizer, and a revenue maximizer such that their respective worst-case expected benefit rates are maximized. Finally, we compare our findings with Naor’s original results and the traditional sample average approximation scheme. Funding: This research was supported by the National Science Foundation [Grants 2342505 and 2343869].
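For orientation, the classical baseline that this abstract extends can be sketched in a few lines. The sketch below is not the paper's distributionally robust method: it assumes a known arrival rate and uses Naor's usual setup, in which a served customer earns a reward and every customer in the system incurs a waiting cost per unit time. The names `R`, `C`, `mu`, and `lam` are illustrative choices, not taken from the abstract.

```python
import math


def individual_threshold(R, C, mu):
    """Largest system size a self-interested customer tolerates.

    A customer who finds n others in the system expects to spend
    (n + 1) / mu time units there, so joining pays off while
    R - C * (n + 1) / mu >= 0, i.e. while n + 1 <= R * mu / C.
    """
    return math.floor(R * mu / C)


def social_benefit_rate(n, lam, mu, R, C):
    """Expected net benefit per unit time when customers join only
    if fewer than n are present (an M/M/1/n queue)."""
    rho = lam / mu
    weights = [rho ** k for k in range(n + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]  # stationary distribution
    mean_in_system = sum(k * p for k, p in enumerate(probs))
    # Arrivals that find the system full (state n) balk and earn nothing.
    return lam * R * (1 - probs[n]) - C * mean_in_system


def social_threshold(lam, mu, R, C):
    """Threshold maximizing the social benefit rate, searched by brute
    force up to the individual threshold (Naor shows it is never larger)."""
    n_e = individual_threshold(R, C, mu)
    return max(range(n_e + 1),
               key=lambda n: social_benefit_rate(n, lam, mu, R, C))
```

With a known arrival rate, the individual threshold does not depend on `lam` at all, while the social threshold does; this is why uncertainty in the arrival rate matters for the social optimizer and the revenue maximizer.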
- 
            Problem definition: Data analytics models and machine learning algorithms are increasingly deployed to support consequential decision-making processes, from deciding which applicants will receive job offers and loans to university enrollments and medical interventions. However, recent studies show that these models may unintentionally amplify human bias and yield significantly unfavorable decisions for specific groups. Methodology/results: We propose a distributionally robust classification model with a fairness constraint that encourages the classifier to be fair under the equality of opportunity criterion. We use a type-[Formula: see text] Wasserstein ambiguity set centered at the empirical distribution to represent distributional uncertainty and derive a conservative reformulation for the worst-case equal opportunity unfairness measure. We show that the model is equivalent to a mixed binary conic optimization problem, which standard off-the-shelf solvers can solve. To improve scalability, we propose a convex, hinge-loss-based model for large problem instances whose reformulation does not incur binary variables. Moreover, we also consider the distributionally robust learning problem with a generic ground transportation cost to hedge against the label and sensitive attribute uncertainties. We numerically examine the performance of our proposed models on five real-world data sets related to individual analysis. Compared with the state-of-the-art methods, our proposed approaches significantly improve fairness with negligible loss of predictive accuracy in the testing data set. Managerial implications: Our paper raises awareness that bias may arise when predictive models are used in service and operations. It generally comes from human bias, for example, imbalanced data collection or low sample sizes, and is further amplified by algorithms. Incorporating fairness constraints and the distributionally robust optimization (DRO) scheme is a powerful way to alleviate algorithmic biases. Funding: This work was supported by the National Science Foundation [Grants 2342505 and 2343869] and the Chinese University of Hong Kong [Grant 4055191]. Supplemental Material: The online appendices are available at https://doi.org/10.1287/msom.2022.0230 .
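As a point of reference, the equality of opportunity criterion is commonly measured as the gap in true positive rates between two groups. The sketch below computes that gap on a plain sample; it is the empirical quantity only, not the worst-case measure over a Wasserstein ambiguity set that the abstract describes. The function name and the 0/1 encoding of labels, predictions, and the sensitive attribute are assumptions made for illustration.

```python
def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true positive rates between group 0 and group 1.

    y_true, y_pred, and group are equal-length sequences of 0/1 values:
    the true label, the predicted label, and the sensitive attribute.
    """
    rates = {}
    for g in (0, 1):
        # Predictions for truly positive individuals in group g.
        preds = [p for y, p, a in zip(y_true, y_pred, group)
                 if y == 1 and a == g]
        if not preds:
            raise ValueError(f"group {g} has no positive examples")
        rates[g] = sum(preds) / len(preds)
    return abs(rates[0] - rates[1])


# Hypothetical toy sample: group 0 has TPR 0.5, group 1 has TPR 1.0.
gap = equal_opportunity_gap(
    y_true=[1, 1, 1, 1, 0, 0],
    y_pred=[1, 0, 1, 1, 1, 0],
    group=[0, 0, 1, 1, 0, 1],
)
```

A gap of zero means that qualified individuals are accepted at the same rate in both groups; a fairness constraint of the kind described above bounds this quantity.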
- 
            Boolean matrix factorization (BMF) has been widely utilized in fields such as recommendation systems, graph learning, text mining, and -omics data analysis. Traditional BMF methods decompose a binary matrix into the Boolean product of two lower-rank Boolean matrices plus homoscedastic random errors. However, real-world binary data typically involves biases arising from heterogeneous row- and column-wise signal distributions. Such biases can lead to suboptimal fitting and unexplainable predictions if not accounted for. In this study, we reconceptualize the binary data generation as the Boolean sum of three components: a binary pattern matrix, a background bias matrix influenced by heterogeneous row or column distributions, and random flipping errors. We introduce a novel Disentangled Representation Learning for Binary matrices (DRLB) method, which employs a dual auto-encoder network to reveal the true patterns. DRLB can be seamlessly integrated with existing BMF techniques to facilitate bias-aware BMF. Our experiments with both synthetic and real-world datasets show that DRLB significantly enhances the precision of traditional BMF methods while offering high scalability. Moreover, the bias matrix detected by DRLB accurately reflects the inherent biases in synthetic data, and the patterns identified in the bias-corrected real-world data exhibit enhanced interpretability.
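The two Boolean operations this abstract relies on can be illustrated with a small sketch. It shows only the Boolean product of two factor matrices and the Boolean sum (elementwise OR) of a pattern matrix with a background bias matrix; the random flipping errors and the DRLB auto-encoder are not modeled. The matrices and function names are hypothetical.

```python
def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is 1 if some k has
    A[i][k] == 1 and B[k][j] == 1, else 0."""
    inner = len(B)
    return [[int(any(A[i][k] and B[k][j] for k in range(inner)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def boolean_sum(X, Y):
    """Boolean sum: elementwise OR of two equally sized 0/1 matrices."""
    return [[int(x or y) for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(X, Y)]


# Two rank-2 Boolean factors (3 x 2 and 2 x 3).
A = [[1, 0],
     [0, 1],
     [1, 1]]
B = [[1, 1, 0],
     [0, 1, 1]]
pattern = boolean_product(A, B)

# A background bias that switches on one extra entry.
bias = [[0, 0, 1],
        [0, 0, 0],
        [0, 0, 0]]
observed = boolean_sum(pattern, bias)
```

In this toy case the bias adds a 1 at position (0, 2) that the two factors cannot explain, which is the kind of entry a bias-aware factorization aims to separate from the pattern.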
- 
            Matrix low rank approximation is an effective method to reduce or eliminate the statistical redundancy of its components. Compared with traditional global low rank methods such as singular value decomposition (SVD), local low rank approximation methods are better suited to uncovering interpretable data structures when clear duality exists between the rows and columns of the matrix. Local low rank approximation is equivalent to low rank submatrix detection. Unfortunately, existing local low rank approximation methods can detect only submatrices of specific mean structure, which may miss a substantial amount of true and interesting patterns. In this work, we develop a novel matrix computational framework called RPSP (Random Probing based submatrix Propagation) that provides an effective solution for the general matrix local low rank representation problem. RPSP detects local low rank patterns that grow from small submatrices of low rank property, which are determined by a random projection approach. RPSP is supported by theories of random projection. Experiments on synthetic data demonstrate that RPSP outperforms all state-of-the-art methods, with the capacity to robustly and correctly identify the low rank matrices when the pattern has a similar mean as the background, the background noise is heteroscedastic, and multiple patterns are present in the data. On real-world datasets, RPSP also demonstrates its effectiveness in identifying interpretable local low rank matrices.
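To make the notion of a local low rank pattern concrete, the sketch below checks whether a chosen submatrix has rank at most one while the full matrix does not. This is not RPSP and uses no random projection; it relies only on the standard fact that a matrix has rank at most one exactly when all of its 2 x 2 minors vanish. The example matrix, the tolerance, and the function names are assumptions for illustration.

```python
from itertools import combinations


def submatrix(M, rows, cols):
    """Extract the entries of M at the given row and column indices."""
    return [[M[i][j] for j in cols] for i in rows]


def has_rank_at_most_one(M, tol=1e-9):
    """True if every 2 x 2 minor of M is zero up to the tolerance."""
    for i, k in combinations(range(len(M)), 2):
        for j, l in combinations(range(len(M[0])), 2):
            if abs(M[i][j] * M[k][l] - M[i][l] * M[k][j]) > tol:
                return False
    return True


# A 4 x 4 matrix whose top-left 2 x 2 block is rank one
# (second row of the block is 1.5 times the first).
M = [[2, 4, 1, 0],
     [3, 6, 0, 1],
     [1, 0, 5, 2],
     [0, 1, 2, 7]]

block_is_low_rank = has_rank_at_most_one(submatrix(M, [0, 1], [0, 1]))
full_is_low_rank = has_rank_at_most_one(M)
```

Checking every candidate submatrix this way is combinatorially expensive, which is one reason detection methods need a smarter search strategy than exhaustive enumeration.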
- 
            Quantitative assessment of the single cell fluxome is critical for understanding the metabolic heterogeneity in diseases. Unfortunately, laboratory-based single cell fluxomics is currently impractical, and the current computational tools for flux estimation are not designed for single cell-level prediction. Given the well-established link between transcriptomic and metabolomic profiles, leveraging single cell transcriptomics data to predict the single cell fluxome is not only feasible but also an urgent task. In this study, we present FLUXestimator, an online platform for predicting the metabolic fluxome and its variations using single cell or general transcriptomics data with large sample sizes. The FLUXestimator webserver implements a recently developed unsupervised approach called single cell flux estimation analysis (scFEA), which uses a new neural network architecture to estimate reaction rates from transcriptomics data. To the best of our knowledge, FLUXestimator is the first web-based tool dedicated to predicting cell-/sample-wise metabolic flux and metabolite variations using transcriptomics data of human, mouse and 15 other common experimental organisms. The FLUXestimator webserver is available at http://scFLUX.org/, and stand-alone tools for local use are available at https://github.com/changwn/scFEA. Our tool provides a new avenue for studying metabolic heterogeneity in diseases and has the potential to facilitate the development of new therapeutic strategies.
 An official website of the United States government